Fix: Prevent crashes when HuggingFace is unreachable #152

Open
mpecanha wants to merge 1 commit into jamiepine:main from mpecanha:fix-offline-mode-crash

Voicebox Offline Mode Fix

Problem

Voicebox crashes when generating speech if HuggingFace is unreachable, even when models are fully cached locally.

Root Cause:

  • Voicebox downloads mlx-community/Qwen3-TTS-12Hz-1.7B-Base-bf16 (the MLX-optimized version)
  • But mlx_audio.tts.load() still tries to fetch config.json from the original repo, Qwen/Qwen3-TTS-12Hz-1.7B-Base
  • When that network request fails, the server crashes with RemoteDisconnected

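Related Issues:

  • jamiepine#150: Internet required even with cached models
  • jamiepine#151: API crashes when HF network fails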

Solution

Two-part fix:

1. Monkey-patch huggingface_hub (backend/utils/hf_offline_patch.py)

  • Intercepts cache lookup functions
  • Forces offline mode early (before mlx_audio imports)
  • Adds debug logging for cache hits/misses
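
The "force offline mode early" step could be sketched roughly as follows. The contents of hf_offline_patch.py are not shown in this PR, so the function names here are hypothetical. HF_HUB_OFFLINE is the environment variable that huggingface_hub reads when it is first imported, and VOICEBOX_OFFLINE_PATCH is the opt-out flag from the Notes. The sketch does not attempt the cache-lookup interception or the debug logging.

```python
import os


def offline_patch_enabled(env=None) -> bool:
    """Return False only when VOICEBOX_OFFLINE_PATCH is explicitly set to "0"."""
    env = os.environ if env is None else env
    return env.get("VOICEBOX_OFFLINE_PATCH", "1") != "0"


def force_offline(env=None) -> bool:
    """Set HF_HUB_OFFLINE=1 unless the patch has been disabled.

    Hypothetical helper: it must run before huggingface_hub (and therefore
    mlx_audio) is imported, because the flag is read at import time.
    """
    env = os.environ if env is None else env
    if not offline_patch_enabled(env):
        return False
    env["HF_HUB_OFFLINE"] = "1"
    return True
```

Calling force_offline() at the very top of mlx_backend.py, ahead of the mlx_audio import, would match the ordering described above.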

2. Symlink original repo to MLX version (ensure_original_qwen_config_cached())

  • When the cache for the original Qwen/Qwen3-TTS-12Hz-1.7B-Base repo doesn't exist
  • But the cache for the MLX repo mlx-community/Qwen3-TTS-12Hz-1.7B-Base-bf16 does exist
  • Creates a symlink from the original repo's cache folder to the MLX one so cache lookups succeed
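
A minimal sketch of the symlink step, assuming the standard huggingface_hub cache layout in which each model repo lives in a folder named models--{org}--{name} under the hub cache directory. This is not the PR's ensure_original_qwen_config_cached() implementation, which is not shown here; the helper names are hypothetical.

```python
from pathlib import Path

ORIGINAL_REPO = "Qwen/Qwen3-TTS-12Hz-1.7B-Base"
MLX_REPO = "mlx-community/Qwen3-TTS-12Hz-1.7B-Base-bf16"


def repo_cache_dir(hub_cache: Path, repo_id: str) -> Path:
    """Map a repo id to its cache folder, e.g. models--Qwen--Qwen3-TTS-..."""
    return hub_cache / ("models--" + repo_id.replace("/", "--"))


def link_original_to_mlx(hub_cache: Path) -> bool:
    """Symlink the original repo's cache folder to the MLX one.

    Returns True only when a new link was created. Nothing is touched if
    the original repo is already cached (or already linked), or if the
    MLX repo is not cached at all.
    """
    original = repo_cache_dir(hub_cache, ORIGINAL_REPO)
    mlx = repo_cache_dir(hub_cache, MLX_REPO)
    if original.exists() or original.is_symlink() or not mlx.is_dir():
        return False
    original.symlink_to(mlx, target_is_directory=True)
    return True
```

With the default cache location this would be called as link_original_to_mlx(Path.home() / ".cache" / "huggingface" / "hub"); a custom HF_HOME or HF_HUB_CACHE setting would change that path.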

Files Changed

  • backend/backends/mlx_backend.py - Added patch imports at top
  • backend/utils/hf_offline_patch.py - New patch module

Testing

To test this fix:

  1. Build Voicebox from source: make build
  2. Disconnect from the internet (with the models already cached locally)
  3. Try generating speech
  4. Generation should succeed without making any network requests
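
As an alternative to physically disconnecting, an automated check could simulate the outage in-process. The sketch below is not part of this PR: network_blocked is a hypothetical helper that makes every Python-level socket connect fail, so any attempt to reach HuggingFace surfaces as an OSError. It only covers connections made through Python's socket module.

```python
import socket
from contextlib import contextmanager


@contextmanager
def network_blocked():
    """Make socket.socket.connect raise OSError inside the with-block."""
    original_connect = socket.socket.connect

    def refuse(self, address):
        raise OSError(f"network access blocked for offline test: {address!r}")

    socket.socket.connect = refuse
    try:
        yield
    finally:
        # Restore the real connect even if the body raised.
        socket.socket.connect = original_connect
```

A test would wrap the speech generation call in "with network_blocked():" and expect it to complete when the models are cached.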

Build Instructions

# Install dependencies
pip install -r requirements.txt

# Build the app
make build

# Or build just the server
make build-server

Notes

  • The patch is applied automatically when mlx_backend.py is imported
  • Set VOICEBOX_OFFLINE_PATCH=0 to disable the patch
  • The symlink approach works because the config.json is compatible between versions

Patch contributed by community

Implements offline mode patch for API stability issues:

- Add hf_offline_patch.py to monkey-patch huggingface_hub
- Force cache-only lookups before mlx_audio imports
- Create symlink from original Qwen repo to MLX community version
  when only MLX version is cached

This fixes:
- Issue jamiepine#150: Internet required even with cached models
- Issue jamiepine#151: API crashes when HF network fails

The patch ensures that if models are locally cached, no network
requests are made to HuggingFace during speech generation.